Classification of Machine Learning Engines using Latent Semantic Indexing

Authors

  • Yuhanis Yusof
  • Taha Alhersh
  • Massudi Mahmuddin
  • Aniza Mohamed Din
Abstract

With the rapid growth in software functionality, size and application domains, categorizing and classifying software for information retrieval and maintenance purposes has become both more difficult and more necessary. This work uses Latent Semantic Indexing (LSI) to classify neural network and k-nearest neighbor source code programs. Functional descriptors of each program are identified by extracting terms contained in the source code. In addition, information on where the terms are extracted from is also incorporated in the LSI. Based on the experiment undertaken, the LSI classifier is noted to generate higher precision and recall than the C4.5 algorithm as provided in the Weka tool.

Keywords: Latent Semantic Indexing; Software Classification; C4.5; Machine Learning Algorithms
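The LSI step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy term lists, the function names (`tokenize`, `lsi_fit`, `fold_in`, `classify`), the choice of `k = 2`, and the nearest-document cosine rule are all assumptions made for illustration, and the sketch does not model where in the source code each term was extracted from.

```python
import re
from collections import Counter

import numpy as np

# Hypothetical toy corpus: terms one might extract from four programs.
DOCS = [
    "neuron layer weight backpropagation activation",
    "hidden layer neuron weight sigmoid",
    "neighbor distance euclidean vote k",
    "nearest neighbor distance sort vote",
]
LABELS = ["neural_network", "neural_network", "knn", "knn"]


def tokenize(source):
    # Lowercase alphabetic terms only; real identifier splitting is richer.
    return [t.lower() for t in re.findall(r"[A-Za-z]+", source)]


def build_term_document_matrix(docs):
    # Rows are terms, columns are documents, entries are raw counts.
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: i for i, t in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for term, count in Counter(tokenize(doc)).items():
            matrix[index[term], j] = count
    return matrix, index


def lsi_fit(matrix, k):
    # Truncated SVD: keep the k largest singular triplets.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :].T  # rows of the last item are documents


def fold_in(query, index, u_k, s_k):
    # Project an unseen program into the latent space: q^T U_k S_k^{-1}.
    q = np.zeros(u_k.shape[0])
    for term, count in Counter(tokenize(query)).items():
        if term in index:  # terms outside the vocabulary are ignored
            q[index[term]] += count
    return (q @ u_k) / s_k


def classify(query, docs, labels, k=2):
    # Assign the label of the most cosine-similar training document.
    matrix, index = build_term_document_matrix(docs)
    u_k, s_k, v_k = lsi_fit(matrix, k)
    q = fold_in(query, index, u_k, s_k)
    norms = np.linalg.norm(v_k, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (v_k @ q) / norms
    return labels[int(np.argmax(sims))]
```

Calling `classify("neuron activation weight", DOCS, LABELS)` projects the new term list into the two-dimensional latent space and returns the label of its closest training program. A location-aware variant could scale the counts by a per-location weight before the SVD, but that weighting scheme would be a further assumption.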

Similar resources

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing, where there exist many low-level representations of images, e.g., SIFT, HOG and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...

A Novel Study for Summary/attribute Based Bug Tracking Classification Using Latent Semantic Indexing and Svd in Data Mining

This paper presents a Latent Semantic Indexing (LSI) method for learning bug tracking concepts in document data. Each attribute in a vector provides the mark of participation of the document in data or term in the parallel concept. The objective to describe the concepts summary based, but to be capable to signify the documents and relations in a combined way for showing document-similarity, docum...

Support Vector Machines for Text Categorization Based on Latent Semantic Indexing

Text Categorization (TC) is an important component in many information organization and information management tasks. Two key issues in TC are feature coding and classifier design. In this paper, a Text Categorization approach via Support Vector Machines (SVMs) based on Latent Semantic Indexing (LSI) is described. Latent Semantic Indexing [1][2] is a method for selecting informative subspaces of featu...

Incorporating Latent Semantic Indexing into Spectral Graph Transducer for Text Classification

Spectral Graph Transducer (SGT) is one of the superior graph-based transductive learning methods for classification. As for the Spectral Graph Transducer algorithm, a good graph representation for data to be processed is very important. In this paper, we try to incorporate Latent Semantic Indexing (LSI) into SGT for text classification. Firstly, we exploit LSI to represent documents as vectors in...

How do some concepts vanish over time?

This paper presents the current stage of my PhD research, focused on the use of machine learning in supporting human learning from examples. I present here an approach to answer how some concepts change their contexts over time, using two techniques suitable for indexing and data mining: latent semantic indexing (LSI) and the APRIORI algorithm.

Publication date: 2012